Method and system for correlating an image capturing device to a human user for analyzing gaze information associated with cognitive performance

ABSTRACT

The present invention provides a method for processing a finalized processed image and related data to identify a spatial location of each pupil in a region of interest. Each pupil is identified by a two-dimensional spatial coordinate. The method includes processing information associated with each pupil identified by the two-dimensional spatial coordinate to output a plurality of two-dimensional spatial coordinates, each of which is in reference to a time, in a two-dimensional space. The method then includes outputting gaze information about the human user. The gaze information includes the two-dimensional spatial coordinates, each of which is in reference to a time in a two-dimensional space.

CROSS-REFERENCE TO RELATED CASES

This application is a continuation of U.S. Ser. No. 16/712,986 filed Dec. 12, 2019, now issued as U.S. Pat. No. 10,984,237, which is a continuation in part of and claims priority to U.S. Ser. No. 15/809,880 filed Nov. 10, 2017, now issued as U.S. Pat. No. 10,517,520 on Dec. 31, 2019, which claims priority to U.S. Provisional Ser. No. 62/420,521 filed Nov. 10, 2016, and that application is incorporated by reference herein, for all purposes.

BACKGROUND

The present invention relates to methods and apparatus for diagnosing cognitive impairment of a subject. In particular, the present invention relates to methods and an apparatus for acquisition of eye movement data. More particularly, the present invention provides methods and an apparatus for acquisition of eye movement data and gaze information.

According to embodiments of the present invention, techniques for processing information associated with eye movement using web based image-capturing devices are disclosed. Merely by way of example, the invention can be applied to analysis of information for determining cognitive performance of subjects.

Historically, recognition memory of a subject has been assessed through conventional paper-pencil based task paradigms. Such tests typically occur in a controlled environment (e.g. laboratory, doctor's office, etc.) under the guidance of a test administrator using expensive (e.g. $10K-$80K) systems. Such tests also require the subject to travel to the laboratory and spend over an hour preparing for and taking such tests. Typically, a test administrator shows a series of visual stimuli to subjects at a certain frequency and rate. After the exposure phase, the user waits for a time delay of over twenty-five minutes before the test administrator tests the subject's recall of the visual stimuli. In addition to the visual stimuli and the test administrator, visual recognition memory paradigms also require response sheets to facilitate administrator scoring. Although effective, conventional paradigms are expensive, cumbersome, and subjective.

From the above, it is seen that techniques for improving acquisition of eye movement data are highly desired.

SUMMARY

According to the present invention, techniques are provided for processing information associated with eye movement using web based image-capturing devices. Merely by way of example, the invention can be applied to analysis of information for determining cognitive performance of subjects.

In an example, the present invention provides a method for identifying a feature of an eye of a human user. The method includes initiating an image capturing device, such as a camera or other imaging device. In an example, the image capturing device comprises a plurality of sensors arranged in an array. In an example, the method includes capturing video information from a facial region of the human user using the image capturing device. In an example, the video information is from a stream of video comprising a plurality of frames.

In an example, the method includes processing the video information to parse the video information into a plurality of images. In an example, each of the plurality of images has a time stamp from a first time stamp, a second time stamp, to an Nth time stamp, where N is greater than 10, or other number.

In an example, the method includes processing each of the images to identify a location of the facial region and processing each of the images with the location of the facial region to identify a plurality of landmarks associated with the facial region. In an example, the method includes processing each of the images with the location of the facial region and the plurality of landmarks to isolate a region including each of the eyes. The method includes processing each of the regions, frame by frame, to identify a pupil region for each of the eyes. In an example, the region is configured as a rectangular region having an x-axis and a y-axis to border each of the eyes of the human user.

In an example, the processing comprises at least: processing the region using a grayscale conversion to output a grayscale image; processing the grayscale image using an equalization process to output an equalized image; processing the equalized image using a thresholding process to output a thresholded image; processing the thresholded image using a dilation and erosion process to output a dilated and eroded image; and processing the dilated and eroded image using a contour and moment process to output a finalized processed image. Of course, there can be other variations, modifications, and alternatives.
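
For illustration, a minimal sketch of the above pipeline is shown below using the OpenCV library. The function name, threshold level, kernel size, and iteration counts are assumptions for this sketch, not necessarily the claimed implementation.

```python
import cv2
import numpy as np

def locate_pupil(eye_region_bgr, thresh_level=40):
    """Return the (x, y) pupil centroid within a rectangular eye region, or None."""
    # Grayscale conversion
    gray = cv2.cvtColor(eye_region_bgr, cv2.COLOR_BGR2GRAY)
    # Equalization process to normalize lighting
    equalized = cv2.equalizeHist(gray)
    # Thresholding process: the pupil is the darkest structure in the region
    _, thresholded = cv2.threshold(equalized, thresh_level, 255, cv2.THRESH_BINARY_INV)
    # Dilation and erosion process to remove specular highlights and noise
    kernel = np.ones((3, 3), np.uint8)
    cleaned = cv2.erode(cv2.dilate(thresholded, kernel, iterations=2), kernel, iterations=2)
    # Contour and moment process: the centroid of the largest dark blob
    contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    m = cv2.moments(max(contours, key=cv2.contourArea))
    if m["m00"] == 0:
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])  # two-dimensional spatial coordinate
```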

In an example, the method includes processing the finalized processed image to identify a spatial location of each pupil in the region, each pupil being identified by a two-dimensional spatial coordinate. The method includes processing information associated with each pupil identified by the two-dimensional spatial coordinate to output a plurality of two-dimensional spatial coordinates, each of which is in reference to a time, in a two-dimensional space. The method then includes outputting gaze information about the human user. The gaze information includes the two-dimensional spatial coordinates, each of which is in reference to a time in a two-dimensional space.

According to one aspect of the invention, a method of processing information including aligning eye movement with an image capturing device for detection of cognitive anomalies is described. One method includes initiating an application, under control of a processor, to output an image of a frame on a display device to a user, the display device being coupled to the processor, the processor being coupled to a communication device coupled to a network of computers, the network of computers being coupled to a server device, initiating a camera coupled to the application to capture a video image of a face of a user, the face of the user being positioned by the user viewing the display device, and displaying the video of the image of the face of the user (e.g. including their eyes, pupils, etc.) on the display device within a vicinity of the frame being displayed. A process includes positioning the face of the user within the frame to align the face to the frame, and capturing an image of the face of the user, processing captured information regarding the image of the face to initiate an image capturing process of eye movement of the user, and outputting an indication on a display after initiation of the image capturing process; and moving the indication spatially to one of a plurality of images being displayed on the display device. A technique includes capturing a video of at least one eye of the human user, while the user's head/face is maintained within the viewing display of the device (e.g. visible to the camera), and one or both eyes of the user moves to track the position of the indication of the display, the image of each eye comprising a sclera portion, an iris portion, and a pupil portion, parsing the video to determine a first reference image corresponding to a first eye position for a first spatial position for the indication; and a second reference image corresponding to a second eye position for a second spatial position of the indication, and correlating each of the other plurality of images to either the first reference image or the second reference image. In various embodiments, the parsing steps may be performed on a user's computing device, or by a remote server.

According to another aspect of the invention, a method for processing information using a web camera, the web camera being coupled to a computing system, is disclosed. One technique includes placing a user in front of a display device coupled to the computing system, the computing system being coupled to a worldwide network of computers, initiating a Neurotrack application stored on a memory device of the computing system, and initiating the web camera by transferring a selected command from the Neurotrack application. A process may include capturing an image of a facial region of the user positioned in front of the display device, retrieving a plurality of test images from the memory device coupled to the computing system, the plurality of test images comprising a first pair of images, a second pair of images, a third pair of images, etc. (e.g. twentieth pair of images), each of the pair of images being related to each other, and displaying the first pair of the test images on the display device to be viewed by the user. A method may include capturing a plurality of first images associated with a first eye location while the user is viewing the first pair of test images, repeating the displaying of the pairs of images, while replacing one of the previous pairs of images, and capturing of images for a plurality of second pair of images to the twentieth pair of images, each of which while the user is viewing the display device, and capturing a fixation from an initial point having four regions within a vicinity of the initial point during the displaying of the pairs of images, while replacing the previous pair of images, each of the four regions within about one degree visual angle from the initial point, and an associated saccade with the fixation. A process may include processing information to filter the saccade, determining a visual preference using the fixation on the replaced image from the plurality of images; and using the visual preference information to provide the user with feedback.

According to another aspect of the invention, a method for playing a matching game on a host computer is disclosed. One technique may include uploading from the host computer to a remote computer system, a computer network address for a plurality of static images, wherein the plurality of static images comprises a first plurality of static images and a second plurality of static images. A method may include uploading from the host computer to the remote computer system, remote computer system executable software code including: first remote computer system executable software code that directs the remote computer system to display on a display of the remote computer system to a player, only static images from the first plurality of static images but not static images from the second plurality of static images, wherein each of the static images from the first plurality of static images is displayed upon at most half of the display for a first predetermined amount of time, second remote computer system executable software code that directs the remote computer system to inhibit displaying on the display of the remote computer system to the player, at least one static image from the first plurality of static images to the player, for a second predetermined amount of time, third remote computer system executable software code that directs the remote computer system to simultaneously display on the display of the remote computer system to the player, a first static image from the first plurality of static images and a second static image from the second plurality of static images, wherein the first static image and the second static image are displayed upon at most half of the display for a third predetermined amount of time, fourth remote computer system executable software code that directs the remote computer system to capture using a web camera of the remote computer system video data of the player, wherein the video data captures eye movements of the player while the display of the remote computer system is displaying to the player the first static image and the second static image, fifth remote computer system executable software code that directs the remote computer system to create edited video data from a subset of the video data in response to a pre-defined two dimensional area of interest from the video data, wherein the edited video data has a lower resolution than the video data, and sixth remote computer system executable software code that directs the remote computer to provide to the host computer, the edited video data. A process may include determining with the host computer a first amount of time representing an amount of time the player views the first static image and a second amount of time representing an amount of time the player views the second static image, in response to the edited video data, determining with the host computer a viewing relationship for the player between the second amount of time and the first amount of time, in response to the first amount of time and the second amount of time, and determining with the host computer whether the viewing relationship for the player between the second amount of time and the first amount of time exceeds a first threshold and generating a success flag in response thereto. A technique may include providing from the host computer to the remote computer system, an indication that the player is successful, in response to the success flag.

The above embodiments and implementations are not necessarily inclusive or exclusive of each other and may be combined in any manner that is non-conflicting and otherwise possible, whether they be presented in association with a same, or a different, embodiment or implementation. The description of one embodiment or implementation is not intended to be limiting with respect to other embodiments and/or implementations. Also, any one or more function, step, operation, or technique described elsewhere in this specification may, in alternative implementations, be combined with any one or more function, step, operation, or technique described in the summary. Thus, the above embodiments and implementations are illustrative, rather than limiting.

BRIEF DESCRIPTION OF DRAWINGS

FIGS. 1A-E illustrate a flow diagram according to an embodiment of the present invention;

FIG. 2 is a simplified diagram of a process according to an embodiment of the present invention;

FIG. 3 is a block diagram of a typical computer system according to various embodiments of the present invention;

FIG. 4 is a graphical user interface of an embodiment of the present invention;

FIG. 5 is a simplified flow diagram of a process according to an embodiment of the present invention;

FIG. 6 is a simplified flow diagram of a process according to an alternative embodiment of the present invention;

FIGS. 7 through 20 are simplified flow diagrams of various processes and related applications of using gaze information according to an embodiment of the present invention; and

FIG. 21 is a simplified block diagram of a process and module for an apparatus according to an embodiment of the present invention.

DETAILED DESCRIPTION

According to the present invention, techniques are provided for processing information associated with eye movement using web based image-capturing devices. Merely by way of example, the invention can be applied to analysis of information for determining cognitive diseases.

Without limiting any of the interpretations in the claims, the following terms have been defined.

Choroid: Layer containing blood vessels that lines the back of the eye and is located between the retina (the inner light-sensitive layer) and the sclera (the outer white eye wall).

Ciliary Body: Structure containing muscle that is located behind the iris and focuses the lens.

Cornea: The clear front window of the eye which transmits and focuses (i.e., sharpness or clarity) light into the eye. Corrective laser surgery reshapes the cornea, changing the focus.

Fovea: The center of the macula, which provides the sharpest vision.

Iris: The colored part of the eye which helps regulate the amount of light entering the eye. When there is bright light, the iris closes the pupil to let in less light. And when there is low light, the iris opens up the pupil to let in more light.

Lens: Focuses light rays onto the retina. The lens is transparent, and can be replaced if necessary. Our lens deteriorates as we age, resulting in the need for reading glasses. Intraocular lenses are used to replace lenses clouded by cataracts.

Macula: The area in the retina that contains special light-sensitive cells. In the macula these light-sensitive cells allow us to see fine details clearly in the center of our visual field. The deterioration of the macula commonly occurs with age (age related macular degeneration or ARMD).

Optic Nerve: A bundle of more than a million nerve fibers carrying visual messages from the retina to the brain. (In order to see, we must have light and our eyes must be connected to the brain.) Your brain actually controls what you see, since it combines images. The retina sees images upside down but the brain turns images right side up. This reversal of the images that we see is much like a mirror in a camera. Glaucoma is one of the most common eye conditions related to optic nerve damage.

Pupil: The dark center opening in the middle of the iris. The pupil changes size to adjust for the amount of light available (smaller for bright light and larger for low light). This opening and closing of light into the eye is much like the aperture in most 35 mm cameras which lets in more or less light depending upon the conditions.

Retina: The nerve layer lining the back of the eye. The retina senses light and creates electrical impulses that are sent through the optic nerve to the brain.

Sclera: The white outer coat of the eye, surrounding the iris.

Vitreous Humor: The clear, gelatinous substance filling the central cavity of the eye.

Novelty preference: Embodiments of the present invention assess recognition memory through comparison of the proportion of time an individual spends viewing a new picture compared to a picture they have previously seen, i.e., a novelty preference. A novelty preference, or more time spent looking at the new picture, is expected in users (e.g. individuals, test subjects, patients) with normal memory function. By contrast, users with memory difficulties (cognitive impairments) are characterized by more equally distributed viewing times between the novel and familiar pictures. The lack of novelty preference suggests a cognitive dysfunction with regard to what the subject has already viewed.

Cameras capturing images/videos (e.g. web cameras) are increasingly part of the standard hardware of smart phones, tablets, and laptop computers. The quality and cost of these devices have allowed for their increased use worldwide, and they are now a standard feature on most smart devices, including desktop and laptop computers, tablets, and smart phones. The inventor of the present invention has recognized that it is possible to incorporate the use of such web cameras for visual recognition tasks. In particular, the inventor has recognized that using such web cameras, he can now provide web-based administration of visual recognition tasks.

Advantages to embodiments of the present invention include that such visual recognition tasks become very convenient for subjects. Subjects need not travel to and from an administration facility (e.g. laboratory, doctor's office, etc.) and the subjects can have such tasks performed from home. Other advantages include that the visual recognition tasks can be administered by a technician remote from the user, or the tasks can be administered by a programmed computer.

Still other advantages to embodiments of the present invention include that the subject's performance on such tasks may be evaluated remotely by an administrator or in some instances by a computer programmed with analysis software. Other advantages include that the subject's test data may be recorded and later reviewed by researchers if there is any question about the test results, whether evaluated by an administrator or by a software algorithm implemented on a computer.

FIG. 1 illustrates a flow diagram according to an embodiment of the present invention. Initially, a subject/user directs their web browser on their computing device to a web page associated with embodiments of the present invention, step 100. In various embodiments, the user may perform this function by entering a URL, selecting a link or icon, scanning a QR code, or the like. Next, in some embodiments, the user may register their contact information to receive their results, or the like.

Next, in response to the user request, a web server provides data back to the user's device, step 110. In various embodiments, the data may include multiple images for use during the recognition task, as well as program code facilitating the recognition task, as described below. In some examples, the program code may include code that may run via the browser, e.g. Adobe Flash code, Java code, AJAX, HTML5, or the like. In other examples, the program code may be a stand-alone executable application that runs upon the user's computer (Mac or PC).

Initially, a series of steps are performed that provide a calibration function. More specifically, in some embodiments the front-facing camera on a user's computing system (e.g. computer, smart device, or the like) is turned on and captures images of the user, step 120. The live images are displayed back to the user on the display of the computing system, step 130. In some embodiments, a mask or other overlay is also displayed on the display, and the user is instructed to either move their head, camera, computing device, or the like, such that the user's head is within a specific region, step 140. In some examples, the mask may be a rectangular, ovoid, or circular region, or the like, generally within the center of a field of view of the camera.

FIG. 4 illustrates an example of an embodiment of the present invention. More specifically, FIG. 4 illustrates a typical graphical user interface 700 that is displayed to a user as mentioned in FIG. 1A, steps 130-150. As can be seen in GUI 700, the user is displayed a series of instructions 710 and an overlay frame 720. Instructions 710 instruct the user how to position their head 730 relative to overlay frame 720. In this example, the user moves their head 730 such that their head fits within overlay frame 720 before the calibration process begins.

Next, in some embodiments, a determination is made as to whether the eyes, more specifically, the pupils of the user can be clearly seen in the video images, step 150. This process may include a number of trial, error, and adjustment feedback cycles by the computing device and the user. For example, adjustments may be made to properties of the video camera, such as gain, ISO, brightness, or the like; adjustments may include instructions to the user to increase or decrease lighting; or the like. In various embodiments, this process may include using image recognition techniques for the user's pupil against the white of the user's eye, to determine whether the pupil position can be distinguished from the white of the eye in the video. Once the system determines that the eyes can be sufficiently tracked, the user is instructed to maintain these imaging conditions for the duration of the visualization task.

As illustrated in FIG. 1, the process then includes displaying a small image (e.g. dot, icon, etc.) on the display and moving the image around the display, and the user is instructed to follow the image with their eyes, step 160. In various embodiments, the locations of the dot on the display are preprogrammed and typically include discrete points or continuous paths along the four corners of the display, as well as near the center of the display. During this display process, a video of the user's eyes is recorded by the camera, step 170. In some embodiments, the video may be the full-frame video captured by the camera or the video may be a smaller region of the full-frame video. For example, the smaller region may be roughly the specific region mentioned in step 140, above (e.g. oval, circle, square, rectangle, etc.); the smaller region may be a specific region where the user's eyes are located (e.g. small rectangle, etc.); the smaller region may be a region capturing the face of a user (e.g. bounding rectangle, etc.); or the like. Such embodiments may be computationally advantageous by facilitating or reducing the computations and analysis performed by the user's computer system or by a remote computing system. Further, such embodiments may be advantageous by greatly reducing communications to, data storage of, and computations by the remote server. In one example, the video is captured using a video capture program called HDFVR, although other programs (e.g. Wowza GoCoder, or the like) may be used in other implementations.

Next, in various embodiments, an analysis is performed upon the video captured in step 170 based upon the display in step 160 to determine a gaze model, step 180. More specifically, the position of the user's pupil with regard to the white of the eye is analyzed with respect to the locations of the dot on the display. For example, when the dot is displayed on the upper right hand side of the display, the position of the user's pupils at the same time is recorded. This recorded position may be used to determine a gaze model for the user. In one specific example, the dot is displayed on the four corners and the center of the display, and the corresponding positions of the user's pupils are used as principal components, e.g. eigenvectors, for a gaze model for the user. In other examples, a gaze model may include a larger or smaller (e.g. two, signifying left and right) number of principal components. In other embodiments, other representations for gaze models may be used.
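
One way such a gaze model could be realized is sketched below: eye patches captured while the user fixates each calibration dot are flattened and decomposed with principal component analysis, and a new patch is matched to the nearest calibration point in component space. All names and parameters here are illustrative assumptions, not the document's actual implementation, and the patches are assumed to share one fixed size.

```python
import numpy as np
from sklearn.decomposition import PCA

def build_gaze_model(calibration_patches, dot_positions, n_components=4):
    """calibration_patches: grayscale eye patches (one per dot), all the same shape.
    dot_positions: (x, y) screen coordinates of each calibration dot."""
    X = np.stack([p.ravel().astype(np.float64) for p in calibration_patches])
    pca = PCA(n_components=min(n_components, len(X) - 1))
    pca.fit(X)  # principal components (eigenvectors) of the calibration patches
    # Pair each reference patch's projection with its screen position.
    references = list(zip(pca.transform(X), dot_positions))
    return pca, references

def estimate_gaze(pca, references, patch):
    """Match a new eye patch to the nearest calibration dot in component space."""
    coeffs = pca.transform(patch.ravel().astype(np.float64)[np.newaxis, :])[0]
    nearest = min(references, key=lambda ref: np.linalg.norm(ref[0] - coeffs))
    return nearest[1]  # (x, y) screen position of the best-matching dot
```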

In various embodiments, the video (or smaller video region) is combined with metadata and sent to a remote server (e.g. analysis server), step 190. In some examples, the metadata is embedded in the video on a frame by frame basis (e.g. interleaved), and in other examples, the metadata may be sent separately or at the end of the video. The inventor believes there are computational advantages to interleaving metadata with each respective video image compared to a separate metadata file and video image file. For example, in some embodiments eye gaze position data for a specific video image is easily obtained from metadata adjacent to that frame. In contrast, in cases of a single metadata file, the computer must maintain an index into the metadata and the video images and hope that the index synchronization is correct.
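
As an illustration of the interleaved layout argued for above, the sketch below writes each encoded frame immediately followed by its own metadata record, so the gaze data for a frame never has to be re-synchronized from a separate file. The record format is an assumption for illustration only.

```python
import json
import struct

def append_frame(stream, frame_bytes, metadata):
    """Write one encoded frame and its adjacent metadata record to a binary stream."""
    meta_bytes = json.dumps(metadata).encode("utf-8")
    # Length-prefix both payloads so a reader can walk frame/metadata pairs in order.
    stream.write(struct.pack("<II", len(frame_bytes), len(meta_bytes)))
    stream.write(frame_bytes)
    stream.write(meta_bytes)  # e.g. {"t": 0.04, "dot": [0.9, 0.1], "gaze": [0.88, 0.12]}
```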

In some embodiments, the metadata may include some combination, but not necessarily all, of the following data: camera setting data, data associated with the user (e.g. account name, email address), browser setting data, timing data for the dots on the display, the gaze model, a determined gaze position, and the like. As examples, the data may provide a correspondence between when a dot is positioned on the upper right corner of the display and an image of how the user's eyes appear in the video at about the same time; the metadata may include timing or a series of frame numbers; or the like. In one example, the combined file or data stream may be a Flash video file, e.g. FLV, a web real-time communications file (WebRTC), or the like. Further, the remote server may be a cloud-based video server, such as a Wowza Amazon web service, or others. In one embodiment, an instance of Wowza can be used to store all of the uploaded integrated video and metadata discussed herein. In some embodiments, the frame rate of the video transferred is at the recording frame rate, e.g. 25 frames per second, 60 frames per second, or the like; however, the remote server may record the video at less than the recording frame rate. In other embodiments, to reduce communications to, data storage of, and computations by the remote server, the frame rate of the transferred video to the remote server may be less, e.g. from about 2 to 3 frames per second up to the recording frame rate.

Next, in various embodiments, a series of steps may be performed that determine whether the gaze model is usable, or not. Specifically, the process includes displaying a small dot, similar to the above, at specific locations on the display, and the user is instructed to stare at the dot, step 200. As the user watches the dot, the video camera captures the user's eyes, step 210. Next, using the full-frame video, or a smaller region of the video, images representing the pupils of the user's eyes are determined, step 220.

In various embodiments, using principal component analysis, the images of the pupils are matched to the gaze model (e.g. eigenvectors) to determine the principal components (e.g. higher order eigenvalues) for the pupils with respect to time. As merely examples, if the user is looking to the center left of the display at a particular time, the principal components determined may be associated with the upper left and lower left of the display, from the gaze model; if the user is looking to the upper center of the display at a different time, the principal components determined may be associated with the center, the upper right, and the upper left of the display, from the gaze model; and the like. Other types of matching algorithms besides principal component analysis may be implemented in other embodiments of the present invention, such as least squares, regression analysis, or the like. In still other embodiments, this process may include determining one or more visual landmarks of a user's face, and pattern matching techniques to determine geometric features, e.g. position and shape of the user's eyes, locations of pupils, direction of pupil gaze, and the like. In various embodiments, if the images of the pupils corresponding to the set display positions do not match the gaze model, the process above may be repeated, step 230.

In various embodiments, similar to step 190 above, the video (or smaller video region) may be combined with metadata (e.g. timing or synchronization data, an indication of where the dot is on the screen when the image of the user's eyes is captured, and the like), and sent back to the remote server, step 240.

Once the gaze model is validated, a series of steps providing a familiarization phase are performed. More specifically, in some embodiments, one or more images are displayed to the user on the display, step 250. In some examples, the images are ones that were provided to the user's computing system in step 110, and in other examples (e.g. using AJAX), the images are downloaded on-demand, e.g. after step 110.

In various embodiments, the images are specifically designed for this visualization task. In one example, the images are all binary images including objects in black over a white background, although other examples may have different object and background colors. In some embodiments, the images may be gray scale images (e.g. 4-bit, 8-bit, etc.), or color images (e.g. 4-bit color, or greater). Further, in some embodiments, the images are static, whereas in other embodiments, the images may be animated or moving. Additionally, in some embodiments, images designed for this visualization task are specifically designed to have a controlled number of geometric regions of interest (e.g. visual saliency).

The number of geometric regions of interest may be determined based upon experimental data, manual determination, or via software. For example, to determine experimental data, images may be displayed to a number of test subjects, and the locations on the image where the test subjects' eyes linger for over a threshold amount of time may be considered a geometric region of interest. After running such experiments, test images may become identifiable via the number of geometric regions of interest. As an example, an image of a triangle may be characterized by three regions of interest (e.g. the corners), and an image of a smiley face may be characterized by four regions of interest (e.g. the two eyes and the two corners of the mouth). In other embodiments, geometric regions of interest may be determined using image processing techniques such as Fourier analysis, morphology, or the like. In some embodiments, the images presented to the user, described below, may each have the same number of regions of interest, or may have different numbers of regions of interest, based upon specific engineering or research purposes.
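
A minimal sketch of the experimental (dwell-time) approach is given below: gaze samples from test subjects are binned on a coarse grid, and any cell where total dwell time exceeds a threshold counts as one geometric region of interest. The grid size and threshold are illustrative assumptions.

```python
import numpy as np

def count_regions_of_interest(gaze_xy, dwell_s, grid=(8, 8), min_dwell_s=0.5):
    """gaze_xy: (N, 2) gaze coordinates normalized to [0, 1]; dwell_s: (N,) durations."""
    heat = np.zeros(grid)
    for (x, y), d in zip(gaze_xy, dwell_s):
        row = min(int(y * grid[0]), grid[0] - 1)
        col = min(int(x * grid[1]), grid[1] - 1)
        heat[row, col] += d
    # Each cell where subjects' eyes lingered beyond the threshold is one region.
    return int((heat >= min_dwell_s).sum())
```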

In some embodiments, an image is displayed to the left half of the display and to the right half of the display for a preset amount of time. The amount of time may range from about 2 seconds to about 10 seconds, e.g. 5 seconds. In other embodiments, different pairs of images may be displayed to the user during this familiarization phase trial.

During the display of the images, the video camera captures the user's eyes, step 260. Next, using the full-frame video, or a smaller region of the video, images representing the pupils of the user's eyes are determined. In some embodiments, using principal component analysis, or the like, of the gaze model, the gaze position of the user's eyes is determined with respect to time, step 280. In various embodiments, similar to step 190 above, the video (or smaller video region) may be combined with metadata (e.g. including an indication of what is displayed on the screen at the time the specific image of the user's eyes is captured, etc.), and the data, or portions of the data, may be sent back to the remote server, step 290. In another embodiment, the video (or smaller video region) may be combined with metadata (e.g. including an indication of what is displayed on the screen at the time the specific image of the user's eyes is captured, etc.) and processed on the user's device (e.g., computer, phone).

In various embodiments, this process may then repeat for a predetermined number of different pictures (or iterations), step 300. In some examples, the process repeats until a predetermined number of sets of images are displayed. In some embodiments, the predetermined number is within a range of 10 to 20 different sets, within a range of 20 to 30 different sets, or within a range of 30 to 90 different sets, although different numbers of trials are contemplated. In some embodiments, the familiarization phase may take about 1 to 3 minutes, although other durations can be used, depending upon desired configuration.

Subsequent to the familiarization phase, a series of steps providing a test phase are performed. More specifically, in some embodiments, one image that was displayed within the familiarization phase is displayed to the user along with a novel image (that was not displayed within the familiarization phase) on the display, step 310. Similar to the above, in some examples, the novel images are ones that were provided to the user's computing system in step 110, whereas in other examples (e.g. using AJAX), the images are downloaded on-demand, e.g. after step 110. In some embodiments, the novel images may be variations of or related to the familiar images that were previously provided during the familiarization phase. These variations or related images may be visually manipulated versions of the familiar images. In some examples, the novel images may be the familiar image that is rotated, distorted (e.g. stretched, pin cushioned), resized, filtered, flipped, and the like, and in other examples, the novel images may be the familiar image that has slight changes, such as subtraction of a geometric shape (e.g. addition of a hole), subtraction of a portion of the familiar image (e.g. removal of a leg of a picture of a table), addition of an extra geometric feature (e.g. adding a triangle to an image), and the like. In various embodiments, the manipulation may be performed on the server and provided to the user's computing system, or the manipulation may be performed by the user's computing system (according to directions from the server).

In various embodiments, the novel images are also specifically designed to be similar to the images during the familiarization phase in appearance (e.g. black over white, etc.) and are designed to have a controlled number of geometric regions of interest. As an example, the novel images may have the same number of geometric regions of interest, a higher number of geometric regions of interest, or the like.

During the display of the novel and familiar images, the video camera captures the user's eyes, step 320. Next, using the full-frame video, or a smaller region of the video, images representing the pupils of the user's eyes are optionally determined. In various embodiments, using principal component analysis, or the like, of the gaze model, the gaze position of the user's eyes is determined with respect to time, step 340.

In some embodiments, based upon the gaze position of the user's eyes during the display of the novel image and the familiar image (typically with respect to time), a determination is made as to whether the user gazes at the novel image for a longer duration compared to the familiar image, step 350. In some embodiments, a preference for the novel image compared to the familiar image may be determined based upon gaze time (51% novel to 49% familiar); a threshold gaze time (e.g. 60% novel to 40% familiar, or the like); based upon gaze time in combination with a number of geometric regions of interest (e.g. 4 novel versus 3 familiar); based upon speed of the gaze between geometric regions of interest (e.g. 30 pixels/second novel versus 50 pixels/second familiar); or the like. In light of the present patent disclosure, other types of gaze factors and other proportions of novel versus familiar may be computed. In various embodiments, the novel image or familiar image preference is stored as metadata, step 360. In various embodiments, similar to step 190 above, the video (or smaller video region) may be combined with metadata (e.g. an indication of which images are displayed on the right-side or left-side of the display, etc.), and sent back to the remote server, step 370.
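
A minimal sketch of the gaze-time variant of step 350 follows, assuming each video frame has already been labeled "novel", "familiar", or neither from the gaze position. The 60/40 threshold mirrors the example above.

```python
def novelty_preference(frame_labels, threshold=0.60):
    """Return (novel gaze fraction, whether the novel-image preference threshold is met)."""
    novel = frame_labels.count("novel")
    familiar = frame_labels.count("familiar")
    total = novel + familiar
    if total == 0:
        return None, False  # no usable gaze frames
    preference = novel / total  # e.g. 0.60 means "60% novel to 40% familiar"
    return preference, preference >= threshold
```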

In various embodiments, this process may then repeat for a predetermined number of different sets of novel and familiar images, step 380. In some examples, the testing phase process repeats until 10 to 20 different sets of images (e.g. iterations) are displayed to the user, although different numbers of trials (e.g. 20 to 30 iterations, etc.) are also contemplated. In some embodiments, novel images that are displayed may have an increasing or decreasing number of geometric regions of interest as the test phase iterates, depending upon performance of the user. For example, if a gaze of a user is not preferencing the novel image over the familiar image, the next novel image displayed to the user may have a greater number of geometric regions of interest, and the like. Other types of dynamic modifications may be made during the test phase depending upon user performance feedback.

In some embodiments, after the test phase, the gaze position data may be reviewed to validate the scores, step 385. In some embodiments, the gaze position data with respect to time may be reviewed and/or filtered to remove outliers and noisy data. For example, if the gaze position data indicates that a user never looks at the right side of the screen, the gaze model is probably incorrectly calibrated, thus the gaze model and gaze data may be invalidated; if the gaze position data indicates that the user constantly looks left and right on the screen, the captured video may be too noisy for the gaze model to distinguish between the left and the right, thus the gaze position data may be invalidated; or the like. In various embodiments, the gaze position data may not only be able to track right and left preference, but in some instances nine or more different gaze positions on the display. In such cases, the gaze position data (for example, a series of (x,y) coordinate pairs) may be filtered in time, such that the filtered gaze position data is smooth and continuous on the display. Such validation of gaze position data may be automatically performed, or in some cases, sub-optimally, by humans.
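
For illustration, one simple realization of the filtering in step 385 is sketched below: a moving-average filter over the (x, y) gaze series to enforce smoothness, plus a sanity check that both halves of the screen were visited. The window size and the left/right check are assumptions for this sketch.

```python
import numpy as np

def validate_gaze(gaze_xy, window=5):
    """Smooth an (N, 2) gaze series and check that both screen halves were viewed."""
    g = np.asarray(gaze_xy, dtype=np.float64)
    kernel = np.ones(window) / window
    smoothed = np.column_stack(
        [np.convolve(g[:, axis], kernel, mode="same") for axis in (0, 1)]
    )
    # If the user never looks at one side, the gaze model is probably miscalibrated.
    looked_left = bool((g[:, 0] < 0.5).any())
    looked_right = bool((g[:, 0] >= 0.5).any())
    return smoothed, looked_left and looked_right
```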

In various embodiments, after data validation, the preferencing data determined above in step 350 and/or in step 385 may be used to determine a cognitive performance score for the user, step 390. For example, if the user shows a preference for the novel image over the familiar image for over about 70% of the time (e.g. 67%), the user may be given a passing or success score; if the user has a preference of over about 50% (e.g. 45%) but less than about 70% (e.g. 67%), the user may be given a qualified passing score; if the user has no preference, e.g. less than about 50% (e.g. 45%), the user may be given an at-risk score or not-successful score. In some embodiments, based upon an at-risk user's score, a preliminary diagnosis indicator (e.g. what cognitive impairment they might have) may be given to the user. The number of classifications as well as the ranges of preference may vary according to specific requirements of various embodiments of the present invention. In some embodiments where step 390 is performed on the user's computer, this data may also be uploaded to the remote server, whereas if step 390 is performed on a remote server, this data may be provided to the user's computer. In some embodiments, the uploaded data is associated with the user in the remote server. It is contemplated that the user may request that the performance data be shared with a health care facility via populating fields in the user's health care records, or on a social network.
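
The three-way scoring rule described above can be sketched as follows; the roughly 50% and 70% cut points come from the text, while the labels are illustrative.

```python
def score_user(novel_fraction, pass_cut=0.70, qualified_cut=0.50):
    """Map a novelty-preference fraction to a cognitive performance score."""
    if novel_fraction >= pass_cut:
        return "passing"
    if novel_fraction >= qualified_cut:
        return "qualified passing"
    return "at risk"
```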

In some embodiments of the present invention, the user's computer system may be programmed to perform none, some, or all of the computations described above (e.g. calibration phase, validation phase, familiarization phase, and/or test phase). In cases where not all of the computations are performed by the user's computer system, a remote server may process the uploaded data based upon the video images and metadata, e.g. timing, synchronization data, indication of which images are displayed on the right-side and the left-side of the display at the time the image of the user's eyes is captured, and the like. In some embodiments, the remote server may return the computed data, e.g. gaze model and principal component analysis results, to the user's computer system, whereas in other embodiments, such computed data is only maintained by the remote server.

In various embodiments, the computations performed within the test phase to determine the preferencing between the novel image and the familiar image may also be partially or completely performed on the user's computer and/or by the remote server. In some embodiments, the determination of preferencing may be made by determining a number of frames having principal components (or other algorithm) of the novel image compared to the number of frames having principal components (or other algorithm) of the familiar image. For example, if the percentage of frames within a test phase, based upon the user's gaze position, where the user is looking at the novel image exceeds a threshold, the user may be considered to have successfully passed the test.

In other embodiments, the evaluation of whether the user is looking at the novel image or the familiar image may be performed by one or more individuals coupled to the remote server. For example, administrators may be presented with the video images of the user's eyes, and based upon their human judgment, the administrator may determine whether the user is looking to the left or to the right of the display, whether the user is blinking, whether the image quality is poor, and the like. This determination is then combined by the remote server with the indication of whether the novel image is displayed on the left or the right of the display, to determine which image the user is looking at during the human-judged frame. In some initial tests, three or more administrators are used so a majority vote may be taken. The inventor is aware that manual intervention may raise the issue of normal variability of results due to subjectivity of the individuals judging the images as well as of the user taking the test, e.g. fatigue, judgment, bias, emotional state, and the like. Such human judgments may be more accurate in some respects, as humans can read and take into account emotions of the user. Accordingly, automated judgments made by algorithms run within the remote server may be less reliable in this respect, as algorithms that attempt to account for human emotions are not well understood.

In some embodiments, the process above may be implemented as a game, where the user is not told of the significance of the images or the testing. In such embodiments, feedback may be given to the user based upon their success in having a preference for the novel image, step 410. Examples of user feedback may include: a sound being played, such as a triumphant fanfare, an applause, or the like; a running score total may increment, and when a particular score is reached a prize may be sent (via mail) to the user; a video may be played; a cash prize may be awarded to the user; a software program may become unlocked or available for download to the user; ad-free streaming music may be awarded; tickets to an event may be awarded; access to a VIP room; or the like. In light of the present patent disclosure, one of ordinary skill in the art will recognize many other types of feedback to provide the user in other embodiments of the present invention.

In other embodiments of the present invention, if the user is identified as not being successful or at risk, the user is identified as a candidate for further testing, step 420. In various embodiments, the user may be invited to repeat the test; the user may be invited to participate in further tests (e.g. at a testing facility, office, lab); the user may be given information as to possible methods to improve test performance; the user may be invited to participate in drug or lifestyle studies; the user may be awarded a care package; or the like. In light of the present patent disclosure, one of ordinary skill in the art will recognize many other types of feedback to provide the user in other embodiments of the present invention. Such offerings for the user may be made via electronic communication, e.g. email, text, telephone call, video call, physical mail, social media, or the like, step 430. In other embodiments, prizes, gifts, bonuses, or the like provided in step 410 may also be provided to the user in step 430.

In one embodiment, the user is automatically enrolled into cognitive decline studies, step 440. As part of such studies, the user may take experimental drugs or placebos, step 450. Additionally, or instead, as part of such studies, the user may make lifestyle changes, such as increasing their exercise, changing their diet, playing cognitive games (e.g. crossword puzzles, brain-training games, bridge, or the like), reducing stress, adjusting their sleeping patterns, and the like. In some embodiments, the user may also be compensated for their participation in such studies by reimbursement of expenses, payment for time spent, free office visits and lab work, and the like.

In various embodiments, the user may periodically run the above-described operations to monitor their cognitive state over time, step 460. For example, in some embodiments, the user may take the above test every three to six months (the first being a baseline), and the changes in the user's performance may be used in steps 390 and 400, above. More specifically, if the user's percentage preference for the novel image drops by a certain amount (e.g. 5%, 10%, etc.) between the tests, step 400 may not be satisfied, and the user may be identified for further testing.
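
A sketch of this longitudinal check might look like the following, flagging the user for further testing when novelty preference drops by more than a set amount relative to the baseline; the 5% default is the text's example figure.

```python
def needs_follow_up(baseline_preference, current_preference, max_drop=0.05):
    """True when the novelty-preference drop between tests exceeds the allowed amount."""
    return (baseline_preference - current_preference) > max_drop
```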

In other embodiments of the present invention, various of the above described steps in FIGS. 1A-1E may be performed by a remote server, and not the user's computer (e.g. client). As examples: step 180 may be performed after step 190 in the remote server; steps 220 and 230 may be performed after step 240 in the remote server; step 280 may be performed after step 290 in the remote server; steps 340-360 may be performed after step 370 on the remote server; steps 390-400 may be performed in the remote server; steps 410-440 may be performed, in part, by the remote server; and the like. The division of the processing may be made between the user's computer and the remote server based upon engineering requirements or preferences, or the like.

FIG. 2 is a simplified diagram of a process according to an embodiment of the present invention. More specifically, FIG. 2 illustrates examples of an image 500 that may be displayed to the subject within the calibration phase; examples of an image 510 that may be displayed to the subject within the validation phase; examples of familiar images 520 that may be displayed to the subject within the familiarization phase; and examples of an image 530 that may be displayed to the subject within the test phase, including a familiar image 540 and a novel image 550. Additionally, as shown in FIG. 2, an example 560 of feedback given to the user is shown. In some embodiments, the feedback may be instantaneous, or arrive later in the form of an electronic or physical message, or the like.

In other embodiments of the present invention, other types of eye tracking task paradigms and studies may be performed besides the ones described above, such as: attention and sequencing tasks, set-shifting tasks, visual discrimination tasks, emotional recognition and bias tasks, and the like. Additional studies may include processing of additional biological information including blood flow, heart rate, pupil diameter, pupil dilation, pupil constriction, saccade metrics, and the like. The process through which biological data and cognitive task performance is collected may be similar to one of the embodiments described above. Additionally, scoring procedures can determine the location and change over time of various landmarks of the participant's face (e.g., pupil diameter, pupil dilation). These procedures can also estimate the eye gaze position of each video frame.

In alternative embodiments, referring below to FIGS. 7 through 20, we show simplified flow diagrams of various processes and related applications of using gaze information according to an embodiment of the present invention. In an example, the processes can be configured with additional applications, such as those provided below.

Visual Image Pairs

In an example, the present process includes a visual image pairs ("Image Pairs") process. As shown referring to FIG. 7, the image pairs process comprises a multimodal memory assessment that has two tasks. In an example, the process includes outputting semantic pairs to a display during a learning phase. Concurrently, video images of a subject are captured while viewing the semantic pairs to learn. Using the present techniques, the process determines the subject's gaze position based upon a gaze model for the subject and related video images. In an example, video images and metadata are transferred via a network to a remote server. The process repeats these steps until the learning phase is complete. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

Now referring to FIG. 8, the process captures video images of a subject while viewing the learned pairs and novel pairs in a test phase. Using the present techniques, the process determines the subject's gaze position based upon a gaze model for the subject and video images. In an example, the process determines the subject's visual recognition memory based upon the gaze location and duration with respect to time, or on a time based metric. In an example, the process stores the subject's visual recognition as metadata in storage. In an example, the video images and metadata are provided and stored in a remote server. The process then determines whether the test phase is completed, and then manually or automatically reviews the subject's gaze position data for the test phase. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

By collecting two measures of memory performance during one test administration, the test can identify single-domain mild cognitive impairment with a relatively low burden on the test-taker. The first part of the test is a visual paired comparison (VPC) task that utilizes webcam-based eye tracking data to assess visual recognition memory. VPC tasks are a method of memory assessment. Briefly, participants are shown a series of identical image pairs during the familiarization phase, followed by a series of about twenty (20) disparate image pairs (containing one image from the familiarization phase and one novel image) during the testing phase. VPC tasks produce a novelty preference score by quantifying the amount of time a person spends viewing the new images compared to familiar images during the testing phase. VPC tasks can stratify populations by memory function, with higher novelty preference scores indicating normal memory and lower novelty preference scores indicating impaired memory function. The second part of the Image Pairs test is a visual paired recognition (PR) task that utilizes haptic feedback data in addition to eye tracking metrics to provide a second measure of visual learning and memory. This task utilizes a paired-associate learning paradigm, which stratifies clinical populations by memory function. This part of the test assesses the participant's learning and memory of the image pairs shown during the VPC task. Participants are instructed to discriminate between image pairs that exactly match the pairs viewed during the VPC testing phase and those that do not, across fifty (50) trials. The image pairs used during this task contain a mix of identical images from the previous task (targets), altered images from the previous task (foils), and entirely new images (shams). Outcome variables include target accuracy, foil accuracy, sham accuracy, reaction time, and d-prime. In an example, d-prime (pronounced 'dee-prime'), also written d', is a sensitivity index used in signal detection theory. D-prime provides the separation between the means of the signal and the noise distributions, compared against the standard deviation of the signal or noise distribution.
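
The d-prime statistic referenced above is conventionally computed as d' = z(hit rate) - z(false-alarm rate), where z is the inverse of the standard normal cumulative distribution function. A sketch follows; clamping the rates away from 0 and 1 is a common practical adjustment, assumed here for illustration.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate, eps=1e-3):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    h = min(max(hit_rate, eps), 1 - eps)         # clamp to keep z finite
    f = min(max(false_alarm_rate, eps), 1 - eps)
    return z(h) - z(f)
```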

Visual Semantic Paired Associates

In an example, the Visual Semantic Paired Associates (VSPA) is another task that measures learning and memory. Participants are shown pairs of words across multiple categories in a learning trial and asked to remember as many of the pairs as they can. A recognition trial is then administered to measure their learning of the first set of words. Participants are then shown a second list of words with some de-coupled words from the first list (proactive interference) and tested on a recognition trial on the second list; then another recognition trial re-tests the first list (retroactive interference). Alternate versions of the test allow for repeated testing. Outcome variables include target accuracy, d-prime, reaction time, gaze location, gaze duration, and other eye patterns indicative of learning and memory.

Paired Symbol Digit Comparison

Referring now to FIG. 9, the process includes a paired symbol digit comparison. As shown, the process outputs symbol pairs to a display during a test trial. The process captures video images of a subject while viewing the symbol pairs to compare in a test phase. Using the present techniques, the process determines the subject's gaze position based upon a gaze model for the subject and video images. In an example, the process determines the subject's visual recognition memory based upon the gaze location and duration with respect to time, or on a time based metric. In an example, the process stores the subject's visual recognition as metadata in storage. In an example, the video images and metadata are provided and stored in a remote server. The process then determines whether the test phase is completed, and then manually or automatically reviews the subject's gaze position data for the test phase. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

In an example, paired symbol digit is a processing speed and executive functioning task that utilizes a paired verification or rejection paradigm (forced choice). Participants are instructed to determine whether two symbols are equal or unequal utilizing a legend with nine number/symbol pairs. At the conclusion of the task, a brief implicit learning trial is administered without the legend present. Outcome variables include target accuracy, d-prime, reaction time, gaze location, gaze duration, and other eye patterns indicative of learning and memory.

Paired Arithmetic Comparison

Referring now to FIG. 10, the process includes a paired arithmetic comparison. As shown, the process outputs symbol pairs to a display during a test trial. The process captures video images of a subject while viewing the symbol pairs to compare in a test phase. Using the present techniques, the process determines the subject's gaze position based upon a gaze model for the subject and video images. In an example, the process determines the subject's visual recognition memory based upon the gaze location and duration with respect to time, or on a time-based metric. In an example, the process stores the subject's visual recognition as metadata in storage. In an example, the video images and metadata are provided and stored in a remote server. The process then determines whether the test phase is completed, and then manually or automatically reviews the subject's gaze position data for the test phase. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

In an example, paired arithmetic is a processing speed and mental arithmetic task that utilizes a paired verification or rejection paradigm (forced choice). Participants are instructed to determine whether two arithmetic equations (e.g., addition, subtraction, etc.) are equal or unequal. Outcome variables include target accuracy, d-prime, reaction time, gaze location, gaze duration, and other eye patterns indicative of learning and memory.

Paired Line Orientation

Referring now to FIG. 11, the process includes a paired line orientation. As shown, the process outputs line pairs to a display during a test trial. The process captures video images of a subject while viewing the line pairs to compare in a test phase. Using the present techniques, the process determines the subject's gaze position based upon a gaze model for the subject and video images. The process transfers the video images and metadata through a network into a remote server. In an example, the process determines the subject's visual recognition performance based upon the gaze location and duration with respect to time, or on a time-based metric. In an example, the process stores the subject's visual recognition as metadata in storage. In an example, the video images and metadata are provided and stored in a remote server. The process then determines whether the test phase is completed, and then manually or automatically reviews the subject's gaze position data for the test phase. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

In an example, paired line orientation is a speeded visual discrimination and spatial working memory task utilizing a paired comparison paradigm. The task requires participants to choose which of two angled lines is parallel to a model line exposed for a brief period of time followed by a brief delay. Outcome variables include target accuracy, reaction time, gaze location and gaze duration.

Paired Line Length

Referring now to FIG. 12, the process includes a paired line length. As shown, the process outputs line pairs to a display during a test trial. The process captures video images of a subject while viewing the line pairs to compare in a test phase. Using the present techniques, the process determines the subject's gaze position based upon a gaze model for the subject and video images. The process transfers the video images and metadata through a network into a remote server. In an example, the process determines the subject's visual recognition performance based upon the gaze location and duration with respect to time, or on a time-based metric. In an example, the process stores the subject's visual recognition as metadata in storage. In an example, the video images and metadata are provided and stored in a remote server. The process then determines whether the test phase is completed, and then manually or automatically reviews the subject's gaze position data for the test phase. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

In an example, paired line length is a speeded visual discrimination and spatial working memory task utilizing a paired comparison paradigm. The task requires participants to choose which of two lines is longer, with lines offset to one another at different positions. Lines are presented for a period of time and participants are asked to respond. Outcome variables include target accuracy, reaction time, gaze location and gaze duration.

Paired Feature Binding

Referring now to FIG. 13, the process includes a paired feature binding. As shown, the process outputs figure pairs to a display during a test trial. The process captures video images of a subject while viewing the figure pairs to compare in a test phase. Using the present techniques, the process determines the subject's gaze position based upon a gaze model for the subject and video images. The process transfers the video images and metadata through a network into a remote server. In an example, the process determines the subject's visual recognition performance based upon the gaze location and duration with respect to time, or on a time-based metric. In an example, the process stores the subject's visual recognition as metadata in storage. In an example, the video images and metadata are provided and stored in a remote server. The process then determines whether the test phase is completed, and then manually or automatically reviews the subject's gaze position data for the test phase. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

In an example, paired feature binding is a speeded visual discrimination and spatial working memory task. The task requires participants to choose whether two images have remained the same with respect to features such as location, color, shape, glyph, or number. Various figures (or drawings) are presented during a familiarization phase, then a brief delay followed by a test phase.

Paired Price Comparison

Referring now to FIGS. 14 and 15, the process includes a learning phase and a test phase for a paired price comparison. In the learning phase, the process outputs an item with seven price pairs to display for a subject. The process captures video images of the subject while the subject is viewing the item price pairs to learn. The process determines the subject's gaze position based upon a gaze model for the subject and video images. In an example, the video images and metadata are transferred to and stored on a remote server. The process determines whether the learning phase is complete, or repeats any of the aforementioned steps. The process outputs the learned item price pairs and novel pairs to display during a test phase. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

In the test phase, video images of the subject are captured while the subject is viewing the learned pairs and novel pairs. The process determines the subject's gaze position based upon a gaze model for the subject and the video images. In an example, the process determines the subject's visual recognition memory based upon the gaze location and duration with respect to time or other time frame. In an example, the process determines whether the test is complete, or repeats any of the aforementioned steps. The process manually or automatically reviews the subject's gaze position data in association with the learned pairs and/or novel pairs. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

In an example, paired price comparison is a brief visual paired associate paradigm. This task requires participants to learn eight (8) food/price pairs and discriminate between target and foil pairs during twenty-four (24) paired recognition trials, although there can be other variations. Outcome variables include target accuracy, d-prime, reaction time, gaze location, gaze duration, and other eye patterns indicative of learning and memory.

Sequencing

In an example, sequencing comprises two parts (I & II) in which the participant is instructed to connect a series of dots as quickly as possible while still maintaining accuracy. In part one, the participant needs to correctly connect the dots in numerical order. In the second part, the participant is required to sequence numbers and letters in alternating order while preserving numerical and alphabetical order. The test provides information about visual search speed, scanning, speed of processing, mental flexibility, and executive functioning. Outcome variables include gaze location, gaze duration, completion time, and errors.

In the test phase in FIG. 16, the process provides for sequencing. In an example, the process outputs alphanumeric stimuli to a display during a test trial. In an example, video images of the subject are captured while the subject is viewing the stimuli to sequence. The process determines the subject's gaze position based upon a gaze model for the subject and the video images. In an example, the process transfers the video images and metadata to a remote server for storage. In an example, the process determines the subject's visual recognition performance based upon the gaze location and duration with respect to time or other time frame. The process transfers the subject's visual recognition data as metadata to a remote server or memory for storage. In an example, the process determines whether the test is complete, or repeats any of the aforementioned steps. The process manually or automatically reviews the subject's gaze position data. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

Mazes

In an example, mazes comprise two types of maze completion tasks. In the first type, a participant is required to complete a maze that has no route choices. In the second type, a participant is required to complete a maze that has choice points in order to successfully complete the maze. Outcome variables include gaze location, gaze duration, completion time, and errors.

In the test phase in FIG. 17, the process provides for a maze. In an example, the process outputs a maze to a display during a test trial. In an example, video images of the subject are captured while the subject is viewing the maze to complete. The process determines the subject's gaze position based upon a gaze model for the subject and the video images. In an example, the process transfers the video images and metadata to a remote server for storage. In an example, the process determines the subject's visual recognition performance based upon the gaze location and duration with respect to time or other time frame. The process transfers the subject's visual recognition data as metadata to a remote server or memory for storage. In an example, the process determines whether the test is complete, or repeats any of the aforementioned steps. The process manually or automatically reviews the subject's gaze position data. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

Saccades

In an example, the saccades task is an eye movement-based task. The participant is administered three blocks of trials in which subjects look at a fixation point in the center of the tablet screen and move their eyes upon presentation of a presented stimulus. In the first block, participants are instructed to follow a stimulus traveling in a pattern on the screen (smooth pursuit). In the second block, participants are instructed to move their eyes in the direction of the presented stimulus (pro-saccade). In the third block (anti-saccade), participants are instructed to move their eyes in the opposite direction of the presented stimulus.

In the test phase in FIG. 18, the process provides for a saccades task. In an example, the process outputs visual stimuli to a display during a test trial. In an example, video images of the subject are captured while the subject is viewing the visual stimuli on which to fixate. The process determines the subject's gaze position based upon a gaze model for the subject and the video images. In an example, the process transfers the video images and metadata to a remote server for storage. In an example, the process determines the subject's visual recognition performance based upon the gaze location and duration with respect to time or other time frame. The process transfers the subject's visual recognition data as metadata to a remote server or memory for storage. In an example, the process determines whether the test is complete, or repeats any of the aforementioned steps. The process manually or automatically reviews the subject's gaze position data. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

Sustained Attention

In an example, in this task participants are shown a series of either letters or numbers and are asked to respond via screen touch or key press when a specific number, letter, or number or letter combination has been displayed. Multiple series of numbers and letters are presented over a sustained period of time, requiring the participant to remain focused and attentive. Outcome variables include gaze location, gaze duration, task accuracy, task errors, and reaction time.

In the test phase in FIG. 19, the process provides for a sustained attention task. In an example, the process outputs alphanumeric stimuli to a display during a test trial. In an example, video images of the subject are captured while the subject is viewing the alphanumeric stimuli. The process determines the subject's gaze position based upon a gaze model for the subject and the video images. In an example, the process transfers the video images and metadata to a remote server for storage. In an example, the process determines the subject's visual recognition performance based upon the gaze location and duration with respect to time or other time frame. The process transfers the subject's visual recognition data as metadata to a remote server or memory for storage. In an example, the process determines whether the test is complete, or repeats any of the aforementioned steps. The process manually or automatically reviews the subject's gaze position data. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

Picture Description

In an example, in this task participants are shown a picture and asked to describe everything they see over the course of 1-3 minutes, or other length of time. The location and duration of gaze is recorded while the participant is describing the picture. Outcome variables include gaze location, gaze duration, number of words, number of pauses, length of words, length of pauses, and syntax of words.

In the test phase in FIG. 20, the process provides for a picture description. In an example, the process outputs a picture to a display during a test trial. In an example, video images of the subject are captured while the subject is viewing the picture to describe. The process determines the subject's gaze position based upon a gaze model for the subject and the video images. In an example, the process transfers the video images and metadata to a remote server for storage. In an example, the process determines the subject's visual recognition performance based upon the gaze location and duration with respect to time or other time frame. The process transfers the subject's visual recognition data as metadata to a remote server or memory for storage. In an example, the process determines whether the test is complete, or repeats any of the aforementioned steps. The process manually or automatically reviews the subject's gaze position data. Further details of a test phase and a gaze process can be found throughout the present specification and more particularly below.

Further details of certain hardware elements and the system can be found throughout the present specification and more particularly below.

FIG. 3 illustrates a functional block diagram of various embodiments of the present invention. A computer system 600 may represent a desktop or laptop computer, a server, a smart device, tablet, a smart phone, or other computational device. In FIG. 3, computing device 600 may include an applications processor 610, memory 620, a touch screen display 630 and driver 640, an image acquisition device (e.g. video camera) 650, audio input/output devices 660, and the like. Additional communications from and to computing device are typically provided via a wired interface 670, a GPS/Wi-Fi/Bluetooth interface 680, RF interfaces 690 and driver 700, and the like. Also included in some embodiments are physical sensors 710.

In various embodiments, computing device 600 may be a hand-held computing device (e.g. Apple iPad, Amazon Fire, Microsoft Surface, Samsung Galaxy Note, an Android tablet); a smart phone (e.g. Apple iPhone, Motorola Moto series, Google Pixel, Samsung Galaxy S); a portable computer (e.g. Microsoft Surface, Lenovo ThinkPad, etc.); a reading device (e.g. Amazon Kindle, Barnes and Noble Nook); a headset (e.g. Oculus Rift, HTC Vive, Sony PlayStation VR) (in such embodiments, motion tracking of the head may be used in place of, or in addition to, eye tracking); or the like.

Typically, computing device 600 may include one or more processors 610. Such processors 610 may also be termed application processors, and may include a processor core, a video/graphics core, and other cores. Processors 610 may be a processor from Apple (e.g. A9, A10), NVidia (e.g. Tegra), Intel (Core, Xeon), Marvell (Armada), Qualcomm (Snapdragon), Samsung (Exynos), TI, NXP, AMD Opteron, or the like. In various embodiments, the processor core may be based upon an ARM Holdings processor such as the Cortex or ARM series processors, or the like. Further, in various embodiments, a video/graphics processing unit may be included, such as an AMD Radeon processor, NVidia GeForce processor, integrated graphics (e.g. Intel), or the like. Other processing capability may include audio processors, interface controllers, and the like. It is contemplated that other existing and/or later-developed processors may be used in various embodiments of the present invention.

In various embodiments, memory 620 may include different types of memory (including memory controllers), such as flash memory (e.g. NOR, NAND), pseudo SRAM, DDR SDRAM, or the like. Memory 620 may be fixed within computing device 600 or removable (e.g. SD, SDHC, MMC, MINI SD, MICRO SD, CF, SIM). The above are examples of computer readable tangible media that may be used to store embodiments of the present invention, such as computer-executable software code (e.g. firmware, application programs), application data, operating system data, images to display to a subject, or the like. It is contemplated that other existing and/or later-developed memory and memory technology may be used in various embodiments of the present invention.

In various embodiments, touch screen display 630 and driver 640 may be based upon a variety of later-developed or current touch screen technology including resistive displays, capacitive displays, optical sensor displays, electromagnetic resonance, or the like. Additionally, touch screen display 630 may include single touch or multiple-touch sensing capability. Any later-developed or conventional output display technology may be used for the output display, such as IPS-LCD, OLED, Plasma, or the like. In various embodiments, the resolution of such displays and the resolution of such touch sensors may be set based upon engineering or non-engineering factors (e.g. sales, marketing). In some embodiments of the present invention, a display output port may be provided based upon: HDMI, DVI, USB 3.X, DisplayPort, or the like.

In some embodiments of the present invention, image capture device 650 may include a sensor, driver, lens and the like. The sensor may be based upon any later-developed or conventional sensor technology, such as CMOS, CCD, or the like. In some embodiments, multiple image capture devices 650 are used. For example, smart phones typically have a rear-facing camera and a front-facing camera (facing the user as the viewer views the display). In various embodiments of the present invention, image recognition software programs are provided to process the image data. For example, such software may provide functionality such as: facial recognition, head tracking, camera parameter control, eye tracking or the like as provided by either the operating system, embodiments of the present invention, or combinations thereof.

In various embodiments, audio input/output 660 may include conventional microphone(s)/speakers. In some embodiments of the present invention, three-wire or four-wire audio connector ports are included to enable the user to use an external audio device such as external speakers, headphones or combination headphone/microphones. In some embodiments, this may be performed wirelessly. In various embodiments, voice processing and/or recognition software may be provided to applications processor 610 to enable the user to operate computing device 600 by stating voice commands. Additionally, a speech engine may be provided in various embodiments to enable computing device 600 to provide audio status messages, audio response messages, or the like.

In various embodiments, wired interface 670 may be used to provide data transfers between computing device 600 and an external source, such as a computer, a remote server, a storage network, another computing device 600, or the like. Such data may include application data, operating system data, firmware, embodiments of the present invention, or the like. Embodiments may include any later-developed or conventional physical interface/protocol, such as: USB 2.x or 3.x, micro USB, mini USB, Firewire, Apple Lightning connector, Ethernet, POTS, or the like. Additionally, software that enables communications over such networks is typically provided.

In various embodiments, a wireless interface 680 may also be provided to provide wireless data transfers between computing device 600 and external sources, such as remote computers, storage networks, headphones, microphones, cameras, or the like. As illustrated in FIG. 3, wireless protocols may include Wi-Fi (e.g. IEEE 802.11x, WiMax), Bluetooth, IR, near field communication (NFC), ZigBee and the like.

GPS receiving capability may also be included in various embodiments of the present invention, although it is not required. As illustrated in FIG. 3, GPS functionality is included as part of wireless interface 680 merely for sake of convenience, although in implementation, such functionality may be performed by circuitry that is distinct from the Wi-Fi circuitry and distinct from the Bluetooth circuitry.

Additional wireless communications may be provided via additional RF interfaces 690 and drivers 700 in various embodiments. In various embodiments, RF interfaces 690 may support any future-developed or conventional radio frequency communications protocol, such as CDMA-based protocols (e.g. WCDMA), 4G, GSM-based protocols, HSUPA-based protocols, or the like. In the embodiments illustrated, driver 700 is illustrated as being distinct from applications processor 610. However, in some embodiments, this functionality is provided upon a single IC package, for example the Marvell PXA330 processor, and the like. It is contemplated that some embodiments of computing device 600 need not include the RF functionality provided by RF interface 690 and driver 700.

FIG. 3 also illustrates that various embodiments of computing device 600 may include physical sensors 710. In various embodiments of the present invention, physical sensors 710 are multi-axis Micro-Electro-Mechanical Systems (MEMS). Physical sensors 710 may include three-axis sensors (linear, gyro or magnetic); six-axis motion sensors (combination of linear, gyro, and/or magnetic); ten-axis sensors (linear, gyro, magnetic, pressure); and various combinations thereof. In various embodiments of the present invention, conventional physical sensors 710 from Bosch, STMicroelectronics, Analog Devices, Kionix, Invensense, mCube, or the like may be used.

In some embodiments, computing device 600 may include a printer 740 for providing printed media to the user. Typical types of printers may include an inkjet printer, a laser printer, a photographic printer (e.g. Polaroid-type instant photos), or the like. In various embodiments, printer 740 may be used to print out textual data to the user, e.g. instructions; print out photographs for the user, e.g. self-portraits; print out tickets or receipts that include custom bar codes, e.g. QR codes, URLs, etc.; or the like.

In various embodiments, any number of future developed or current operating systems may be supported, such as iPhone OS (e.g. iOS), Windows, Google Android, or the like. In various embodiments of the present invention, the operating system may be a multi-threaded multi-tasking operating system. Accordingly, inputs and/or outputs from and to touch screen display 630 and driver 640, and inputs and/or outputs to physical sensors 710, may be processed in parallel processing threads. In other embodiments, such events or outputs may be processed serially, or the like. Inputs and outputs from other functional blocks, such as image acquisition device 650 and physical sensors 710, may also be processed in parallel or serially, in other embodiments of the present invention.

In some embodiments, such as a kiosk-type computing device 600, a dispenser mechanism 720 may be provided, as well as an inventory of items 730 to dispense. In various examples any number of mechanisms may be used to dispense an item 730, such as: a gum-ball-type mechanism (e.g. a rotating template); a snack-food vending-machine-type mechanism (e.g. rotating spiral, sliding doors, etc.); a can or bottle soft-drink dispensing mechanism; or the like. Such dispensing mechanisms are under the control of processor 610. With such embodiments, a user may walk up to the kiosk and interact with the process described in FIGS. 1A-1E. Based upon the test results, processor 610 may activate the dispenser mechanism 720 and dispense one or more of the items 730 to the user in FIG. 1E, steps 410, 430, 440 or 450.

FIG. 3 is representative of one computing device 600 capable of embodying the present invention. It will be readily apparent to one of ordinary skill in the art that many other hardware and software configurations are suitable for use with the present invention. Embodiments of the present invention may include at least some but need not include all of the functional blocks illustrated in FIG. 3. For example, in various embodiments, computing device 600 may lack touch screen display 630 and driver 640, or RF interface 690 and/or driver 700, or GPS capability, or the like. Additional functions may also be added to various embodiments of computing device 600, such as a physical keyboard, an additional image acquisition device, a trackball or trackpad, a joystick, an internal power supply (e.g. battery), or the like. Further, it should be understood that multiple functional blocks may be embodied into a single physical package or device, and various functional blocks may be divided and be performed among separate physical packages or devices.

In some embodiments of the present invention, computing device 600 may be a kiosk structure. Further, in some instances the kiosk may dispense an item, such as a placebo drug, a drug study medication, different types of foods (e.g. snacks, gum, candies), different types of drinks (e.g. placebo drink, drug study drink), and the like. In some instances, an item may be informational data printed by printer 740 related to the performance of the user (e.g. life style advice, eating well information, etc.). In additional instances, an item (e.g. a ticket or stub) may include a custom URL, bar code (e.g. 2D bar code, QR code), or the like, that links to a web site that has access to the user's test results. It is contemplated that the linked site may be associated with a testing organization, a drug study site associated with a pharmaceutical company, a travel web site, an e-commerce web site, or the like. In such cases, for privacy purposes, it is contemplated that the user will remain anonymous to the linked site, until the user chooses to register their information. In still other instances, an item may be a picture of the user (e.g. a souvenir photo, a series of candid photographs, or the like), in some instances in conjunction with the informational data or link data described above. In yet other embodiments, the kiosk may be mobile, and the kiosk may be wheeled up to users, e.g. non-ambulatory users.

Having described various embodiments and implementations, it should be apparent to those skilled in the relevant art that the foregoing is illustrative only and not limiting, having been presented by way of example only. For example, in some embodiments, a user computing device may be a tablet or a smart phone, and a front facing camera of such a device may be used as the video capture device described herein. Additionally, the various computations described herein may be performed by the tablet or smart phone alone, or in conjunction with the remote server. Many other schemes for distributing functions among the various functional elements of the illustrated embodiment are possible. The functions of any element may be carried out in various ways in alternative embodiments.

Also, the functions of several elements may, in alternative embodiments, be carried out by fewer, or a single, element. Similarly, in some embodiments, any functional element may perform fewer, or different, operations than those described with respect to the illustrated embodiment. Also, functional elements shown as distinct for purposes of illustration may be incorporated within other functional elements in a particular implementation. Also, the sequencing of functions or portions of functions generally may be altered. Certain functional elements, files, data structures, and so on may be described in the illustrated embodiments as located in system memory of a particular computer. In other embodiments, however, they may be located on, or distributed across, computer systems or other platforms that are co-located and/or remote from each other. For example, any one or more of data files or data structures described as co-located on and “local” to a server or other computer may be located in a computer system or systems remote from the server. In addition, it will be understood by those skilled in the relevant art that control and data flows between and among functional elements and various data structures may vary in many ways from the control and data flows described above or in documents incorporated by reference herein. More particularly, intermediary functional elements may direct control or data flows, and the functions of various elements may be combined, divided, or otherwise rearranged to allow parallel processing or for other reasons. Also, intermediate data structures or files may be used, and various described data structures or files may be combined or otherwise arranged.

Further embodiments can be envisioned by one of ordinary skill in the art after reading this disclosure. For example, some embodiments may be embodied as a turn-key type system such as a laptop or kiosk with executable software resident thereon. The software is executed by the processor of the laptop and provides some, if not all, of the functionality described above in FIGS. 1A-1E, such as: calibration of the web camera to the subject's face and/or eyes, determination of a gaze model, outputting of the familiar and test images at the appropriate times, determining the gaze of the subject for familiar and test images with respect to time during the test using the gaze model, and the like. The subject test data may be stored locally on the laptop and/or be uploaded to a remote server.

In various embodiments, features other than just the gaze position of the user may be utilized. For example, facial expressions (e.g. eyebrows, lip position, etc.) as well as hand placements and gestures of a user may also be considered (e.g. surprise, puzzlement, anger, bewilderment, etc.) when determining the cognitive performance of a user. In other embodiments, additional eye-related factors may also be detected and used, such as: blink rate of the user, pupil dilation, pupil responsiveness (e.g. how quickly the pupil dilates in response to a flash on the display), saccadic movement, velocity, and the like.

In still other embodiments, the method performs a treatment or further analysis using information from the analysis and/or diagnostic methods described above. In an example, the treatment or further analysis can include an MRI scan, CAT scan, x-ray analysis, PET scans, a spinal tap (cerebral spinal fluid) test (amyloid plaque, tau protein), a beta-amyloid test, an MRT blood test, and others. In an example, initiating any of these analysis and/or diagnostic methods includes using the information to open a lock or interlock to initiate the analysis and/or diagnostic method. In an example, treatment can include automated or manual administration of a drug or therapy.

As an example discussed above, in kiosk embodiments, the treatment includes using the user's cognitive performance information to access the drug, which is under a lock or in a secured container, and to dispense the drug. In another example, PET/MRI scans are provided for amyloid plaque. In an example, the treatment can include a spinal tap for cerebral spinal fluid to measure amyloid plaque and tau (a protein believed to be involved in Alzheimer's disease). Of course, there can be other variations, modifications, and alternatives. In yet another example, treatment can include a physician that places the patient on an Alzheimer's drug such as Namenda™, Exelon™, among others. In an example, the patient can be treated using wearable devices such as a Fitbit™ to track exercise, movement, and sleep activity.

In an example, the method provides results that are preferably stored and secured in a privileged and confidential manner. In an example, the results and/or information is secured, and subject to disclosure only by unlocking a file associated with the information. In an example, a physician or health care expert can access the results after the security is removed.

In an example, the image capturing device can also be configured to capture another facial element. The facial element can include a mouth, nose, cheeks, eyebrows, ears, or other feature, or any relations among these features, which can be moving or in a certain shape and/or place, to identify other feature elements associated with an expression or other indication of the user. Of course, there can be other variations, modifications, and alternatives.

In an example, the image capturing device can also be configured to capture another element of known shape and size as a reference point. In an example, the element can be a fixed hardware element, a piece of paper, or code, or other object, which is fixed and tangible. As an example, a doctor, pharmaceutical company, or the like may provide the user with a business card or other tangible item that has a unique QR code imprinted thereon. In various embodiments, the user may display the QR code to the camera, for example in FIG. 1A, step 120. In other embodiments, the remote server may use the QR code to determine a specific version of the cognitive test described therein and provide specific prizes, gifts, and information to the user. As an example, Pharmaceutical A may have a 6 minute visual test, based upon colored images, whereas Researchers B may have a 5 minute visual test, based upon black and white images, etc. It should be understood that many other adjustments may be made to the process described above, and these different processes may be implemented by a common remote server. Of course, there can be other variations, modifications, and alternatives.
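
Merely by way of example, decoding a QR code shown to the camera and mapping it to a test version can be sketched in Python using OpenCV's QRCodeDetector; the payload strings, the version table, and the file name below are hypothetical illustrations, not part of the specification.

    # Illustrative sketch; the payload-to-version mapping is hypothetical.
    import cv2

    def read_test_version(frame):
        """Decode a QR code from a captured frame and look up a test variant."""
        data, points, _ = cv2.QRCodeDetector().detectAndDecode(frame)
        if not data:
            return None  # no QR code visible in this frame
        versions = {
            "PHARMA_A": {"duration_min": 6, "images": "color"},
            "RESEARCH_B": {"duration_min": 5, "images": "black_and_white"},
        }
        return versions.get(data)

    frame = cv2.imread("business_card.png")  # e.g., a card held up to the camera
    if frame is not None:
        print(read_test_version(frame))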

In other examples, the present technique can be performed multiple times. In an example, the multiple times can be performed to create a baseline score. Once the baseline score is stored, other tests can be performed at other times and referenced against the baseline score. In an example, the baseline score is stored into memory on a secured server or client location. The baseline score can be retrieved by a user, and then processed along with new test scores to create additional scores. Of course, there can be other variations, modifications, and alternatives.

In an alternative example, the present technique can be used to identify other cognitive diseases or other features of the user, such as: anxiety, stress, depression, suicidal tendencies, childhood development, and the like. In an example, the technique can be provided on a platform for other diseases. The other diseases can be addressed by various modules included on the platform. In still other embodiments, the disclosed techniques may be used as a platform for other user metrics, e.g. user motion or gait capture and analysis, user pose or posture analysis. Embodiments may be located at hospitals, and when users take the test and fail to show sufficient novelty preference, a directory of specific doctors or departments may become unlocked to them. Users who show sufficient novelty preference may not have access to such providers.

In some embodiments, user response to different pictures may be used for security purposes (e.g. TSA, CIA, FBI, police) or the like. As an example, during a testing phase, the user may be displayed a familiar image that is neutral, such as a flower or stop sign, and be displayed a novel image that illustrates violence, such as an AK-47 gun, a bomb, a 9-11 related image, or the like. In some cases, a user who deliberately avoids looking at the novel image may be considered a security risk. Other embodiments may be used in a motor vehicle department for determining whether older drivers have sufficient cognitive performance to safely handle a vehicle.

In some examples, algorithms may be implemented to determine whether a user is attempting to fool the system. For example, gaze analysis may be used to determine if the user is trying to cover up a cognitive shortcoming.

In an example, the present technique can be implemented on a stand-alone kiosk. In an example, the camera and other hardware features can be provided in the kiosk, which is placed strategically in a designated area. The kiosk can be near a pharmacy, an activity, or a security zone, among others. In an example, the technique unlocks a dispenser to provide a drug, or the technique unlocks a turnstile or security gate, or the like, after suitable performance of the technique. Of course, there can be other variations, modifications, and alternatives.

In an example, the technique can also be provided with a flash or other illumination directed into each of the eyes.

In other embodiments, combinations or sub-combinations of the above disclosed invention can be advantageously made. The block diagrams of the architecture and flow charts are grouped for ease of understanding. However, it should be understood that combinations of blocks, additions of new blocks, re-arrangement of blocks, and the like are contemplated in alternative embodiments of the present invention. Further examples of embodiments of the present invention are provided below.

FIG. 5 is a simplified flow diagram of a process according to an embodiment of the present invention. This diagram is merely an example, which should not unduly limit the scope of the claims herein. Referring to the Figure, in an example, the present invention provides a method for identifying a feature of an eye of a human user. The method includes initiating an image capturing device, such as a camera or other imaging device. In an example, the image capturing device comprises a plurality of sensors arranged in an array. In an example, the method includes capturing video information from a facial region of the human user using the image capturing device. In an example, the video information is from a stream of video comprising a plurality of frames.

In an example, the image capturing device is provided in a computer, a mobile device, a smart phone, or other end user computing device. In an example, the plurality of video frames is one of a plurality of images. In an example, each of the plurality of images that have been parsed comprises RGB information associated with the human user.

In an example, the method includes processing the video information to parse the video information, frame by frame, into the plurality of images. In an example, each of the plurality of images has a time stamp from a first time stamp, a second time stamp, to an Nth time stamp, where N is greater than 10, or other number. Of course, there can be other variations, modifications, and alternatives.
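
Merely by way of example, the frame-by-frame parsing with time stamps can be sketched in Python using OpenCV; the file name is a hypothetical illustration, and the time stamps are read from the capture device.

    # Illustrative sketch of parsing a video stream into time-stamped frames.
    import cv2

    def parse_frames(path: str):
        """Yield (timestamp_ms, image) pairs, one pair per frame, in capture order."""
        capture = cv2.VideoCapture(path)
        try:
            while True:
                ok, image = capture.read()
                if not ok:
                    break  # end of the stream
                yield capture.get(cv2.CAP_PROP_POS_MSEC), image
        finally:
            capture.release()

    frames = list(parse_frames("session.mp4"))  # hypothetical file name
    assert len(frames) > 10  # N greater than 10, per the example above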

In an example, the method includes processing each of the images to identify a location of the facial region and processing each of the images with the location of the facial region to identify a plurality of landmarks associated with the facial region. In an example, the facial region can be identified using a matching or processing technique. Landmarks can include other facial features, such as a mouth, cheeks, nose, ears, and other facial features. In an example, the method includes processing each of the images with the location of the facial regions and the plurality of landmarks to isolate a region including each of the eyes. In an example, the processing identifies the region including the eyes. The method includes processing each of the regions, frame by frame, to identify a pupil region for each of the eyes. In an example, the region is configured as a rectangular region having an x-axis and a y-axis to border each of the eyes of the human user. In an example, the spatial location of the pupil is desirable for gaze analysis.
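
Merely by way of example, the landmark-based isolation of a rectangular eye region can be sketched as follows using the dlib face detector and its 68-point landmark predictor; the model file path, the margin value, and the landmark index ranges (36-41 and 42-47 for the two eyes) are assumptions of this sketch rather than requirements of the specification.

    # Illustrative sketch; model path, margin, and indices are assumptions.
    import cv2
    import dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    def eye_regions(image, margin: int = 5):
        """Return a bounding rectangle (x, y, w, h) for each detected eye."""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        rectangles = []
        for face in detector(gray):
            shape = predictor(gray, face)
            for start, end in ((36, 42), (42, 48)):  # the two eye landmark runs
                xs = [shape.part(i).x for i in range(start, end)]
                ys = [shape.part(i).y for i in range(start, end)]
                x, y = min(xs) - margin, min(ys) - margin
                rectangles.append((x, y, max(xs) - x + margin, max(ys) - y + margin))
        return rectangles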

In an example, the processing comprises a variety of steps. In an example, processing includes processing the region using a grayscale conversion to output a grayscale image, and processing the grayscale image using an equalization process to output an equalized image. In an example, the equalized image uses a histogram equalization, which is a method in image processing of contrast adjustment using the image's histogram. The processing also includes processing the equalized image using a thresholding process to output a thresholded image; processing the thresholded image using a dilation (e.g., adding border pixels) and erosion (e.g., stripping away border pixels) process to output a dilated and eroded image; and processing the dilated and eroded image using a contour and moment process to output a finalized processed image. Of course, there can be other variations, modifications, and alternatives.
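
Merely by way of example, the chain of steps above maps onto standard OpenCV operations, as in the following sketch; the threshold value and kernel size are illustrative tuning assumptions, not values taken from the specification.

    # Illustrative sketch of the grayscale / equalization / thresholding /
    # dilation-erosion / contour-and-moment chain; tuning values are assumptions.
    import cv2
    import numpy as np

    def pupil_center(eye_region):
        """Return the (x, y) centroid of the pupil within an eye region, or None."""
        gray = cv2.cvtColor(eye_region, cv2.COLOR_BGR2GRAY)
        equalized = cv2.equalizeHist(gray)  # histogram equalization
        _, thresholded = cv2.threshold(equalized, 40, 255, cv2.THRESH_BINARY_INV)
        kernel = np.ones((3, 3), np.uint8)
        dilated = cv2.dilate(thresholded, kernel, iterations=1)  # add border pixels
        eroded = cv2.erode(dilated, kernel, iterations=1)        # strip border pixels
        contours, _ = cv2.findContours(eroded, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        moments = cv2.moments(max(contours, key=cv2.contourArea))
        if moments["m00"] == 0:
            return None  # degenerate contour with zero area
        return moments["m10"] / moments["m00"], moments["m01"] / moments["m00"]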

In an example, the method includes processing the finalized processed image to identify (using a spatial coordinate system) a spatial location of each pupil in the region, each pupil being identified by a two-dimensional spatial coordinate. The method includes processing information associated with each pupil identified by the two-dimensional spatial coordinate to output a plurality of two-dimensional spatial coordinates, each of which is in reference to a time, in a two-dimensional space. That is, each two-dimensional spatial coordinate, in reference to a time in a two-dimensional space, represents a location of the pupils. The plurality of the two-dimensional coordinates, each of which is in reference to a time, represents gaze information. The method then includes outputting a gaze information about the human user. The gaze information includes the two-dimensional spatial coordinates, each of which is in reference to a time in a two-dimensional space. In an example, the results of each of the steps can be stored in temporary or permanent memory, and can be accessed from time to time. In an example, the processing device coordinates the processing in conjunction with an image processing device or other processor. Of course, there can be other variations, modifications, and alternatives.
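
Merely by way of example, the gaze information can be represented as time-referenced two-dimensional coordinates, as in the following sketch; the GazeSample name is an assumption of this illustration.

    # Illustrative sketch of gaze information as time-referenced coordinates.
    from dataclasses import dataclass

    @dataclass
    class GazeSample:
        timestamp_ms: float  # time stamp of the source frame
        x: float             # pupil x coordinate in the two-dimensional space
        y: float             # pupil y coordinate

    def gaze_information(samples):
        """Output (time, x, y) triples ordered by time stamp."""
        ordered = sorted(samples, key=lambda s: s.timestamp_ms)
        return [(s.timestamp_ms, s.x, s.y) for s in ordered]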

In an example, the method further includes associating the gaze information with one or more cognitive assessment constructs. In an example, each of the cognitive assessment constructs is stored in memory of a computing device.

In an example, the method further includes transferring the video information through a network to a server device, the server device being coupled to the network.

In an example, the region is processed using the grayscale conversion, retrieved from a library provided in memory, to output a grayscale image; the grayscale image is processed using the equalization process, retrieved from a library provided in the memory, to output an equalized image; the equalized image is processed using the thresholding process, retrieved from a library provided in the memory, to output a thresholded image; the thresholded image is processed using a dilation and erosion process, retrieved from a library from memory, to output a dilated and eroded image; and the dilated and eroded image is processed using a contour and moment process, retrieved from a library from the memory, to output a finalized processed image.

In an example, the method further comprises using the gaze information to associate the gaze information with a cognitive learning feature. In an example, the method further comprises using the gaze information to associate the gaze information with a cognitive learning feature, the cognitive learning feature being one of a plurality of learning disorders.

In an example, the invention includes an alternative method for identifying a feature of an eye of a human user. The method includes initiating an image capturing device, the image capturing device comprising a plurality of sensors arranged in an array. The method includes capturing information from a facial region of the human user using the image capturing device, the information comprising a plurality of frames. The method includes processing the information to parse the information into a plurality of images, each of the plurality of images having a time stamp from a first time stamp, a second time stamp, to an Nth time stamp, where N is greater than 10.

In an example, the method includes processing each of the images to identify a location of the facial region and processing each of the images with the location of the facial region to identify a plurality of landmarks associated with the facial region. The method includes processing each of the images with the location of the facial regions and the plurality of landmarks to isolate a region including each of the eyes and processing each of the regions, frame by frame, to identify a pupil region for each of the eyes.

In an example, the processing comprises at least: processing the region using a grayscale conversion to output a grayscale image; processing the grayscale image using an equalization process to output an equalized image; processing the equalized image using a thresholding process to output a thresholded image; processing the thresholded image using a dilation and erosion process to output a dilated and eroded image; and processing the dilated and eroded image using a contour and moment process to output a finalized processed image.

In an example, the method includes processing the finalized processed image to identify a spatial location of each pupil in the region, each pupil being identified by a two-dimensional spatial coordinate, and processing information associated with each pupil identified by the two-dimensional spatial coordinate to output a plurality of two-dimensional spatial coordinates, each of which is in reference to a time, in a two-dimensional space. The method includes outputting a gaze information about the human user. In an example, the gaze information includes the plurality of two-dimensional spatial coordinates, each of which is in reference to a time, in a two-dimensional space.

FIG. 6 is a simplified flow diagram of a process according to an alternative embodiment of the present invention. In an example, the process is provided to process information associated with a gaze of a subject or human user. In an example, once the video has been captured, the process examines the video. The process forms a trial video, and parses out trial frames, as shown. In an example, the process identifies pupil position for the gaze process. The pupil positions are plotted, and filtered to remove outliers. Each of the pupil positions is provided in a spatial coordinate with a selected range. The process scores the trial and then the exam. Of course, there can be other variations, modifications, and alternatives.
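
Merely by way of example, the outlier-removal step can be sketched with a median absolute deviation (MAD) rule over the plotted pupil positions; the 3.5 cutoff and the 0.6745 scale factor are common heuristics assumed for this illustration, not values from the specification.

    # Illustrative sketch of filtering pupil-position outliers with a MAD rule.
    import numpy as np

    def filter_outliers(points: np.ndarray, cutoff: float = 3.5) -> np.ndarray:
        """Keep (x, y) pupil positions whose distance from the median is typical."""
        median = np.median(points, axis=0)
        distances = np.linalg.norm(points - median, axis=1)
        mad = np.median(np.abs(distances - np.median(distances)))
        if mad == 0:
            return points  # positions are essentially constant; nothing to remove
        modified_z = 0.6745 * (distances - np.median(distances)) / mad
        return points[np.abs(modified_z) <= cutoff]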

In an example referring to FIG. 21, the present technique provides an apparatus for identifying a feature of an eye of a human user. The apparatus has a processing device, a memory device coupled to the processing device, and an image capturing device coupled to the processing device. In an example, the image capturing device comprises a plurality of sensors arranged in an array. In an example, the image capturing device is initiated by a processing device and is configured to capture information from a facial region of the human user using the image capturing device, the information comprising a plurality of frames.

In an example, the apparatus has an image processing module coupled to the processing device. In an example, the image processing module or device, also known as an image processing engine, image processing unit (IPU), or image signal processor (ISP), is a type of media processor or specialized digital signal processor (DSP) used for image processing, in digital cameras or other devices. In an example, image processors often employ parallel computing to increase speed and efficiency. The digital image processing engine can perform a range of tasks. To increase the system integration on embedded devices, often it is a system on a chip with a multi-core processor architecture. In an example, the image processing module is configured to: process the information to parse the information into the plurality of images, each of the plurality of images having a time stamp from a first time stamp, a second time stamp, to an Nth time stamp, where N is greater than 10 or other time frame; process each of the images to identify a location of the facial region; process each of the images with the location of the facial region to identify a plurality of landmarks associated with the facial region; and process each of the images with the location of the facial regions and the plurality of landmarks to isolate a region including each of the eyes.

In an example, the image processing module is configured to process each of the regions, frame by frame, to identify a pupil region for each of the eyes, the processing of each of the regions, frame by frame, to identify the pupil region for each of the eyes comprising at least: processing the region using a grayscale conversion to output a grayscale image; processing the grayscale image using an equalization process to output an equalized image; processing the equalized image using a thresholding process to output a thresholded image; processing the thresholded image using a dilation and erosion process to output a dilated and eroded image; and processing the dilated and eroded image using a contour and moment process to output a finalized processed image.

In an example, the module is further configured to process the finalized processed image to identify a spatial location of each pupil in the region, each pupil being identified by a two-dimensional spatial coordinate; and process information associated with each pupil identified by the two-dimensional spatial coordinate to generate a plurality of two-dimensional spatial coordinates, each of which is in reference to a time, in a two-dimensional space.

In an example, the apparatus has an output handler to output a gaze information associated with the plurality of two-dimensional spatial coordinates, each of which is in reference to the time, in the two-dimensional space, about the human user. In an example, the gaze information is stored in the memory device. Of course, there can be other variations, modifications, and alternatives.

The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the claims.

What is claimed:
1. A method for identifying a feature of an eye of a human user, the method comprising: initiating an image capturing device under control of a processing device, the image capturing device comprising a plurality of sensors arranged in an array; capturing video information from a facial region of the human user using the image capturing device, the video information being a stream of video comprising a plurality of frames; processing the video information, using an image processing device, to parse the video information into a plurality of images, each of the plurality of images having a time stamp from a first time stamp, a second time stamp, to an Nth time stamp, where N is greater than 10; processing each of the images using the image processing device to identify a location of the facial region; processing each of the images with the location of the facial region to identify a plurality of landmarks associated with the facial region; processing each of the images with the location of the facial regions and the plurality of landmarks to isolate a region including each of the eyes; processing each of the regions, frame by frame, using the image processing device, to identify a pupil region for each of the eyes, the processing comprising at least: processing the region using a grayscale conversion to output a grayscale image; processing the grayscale image using an equalization process to output an equalized image; processing the equalized image using a thresholding process to output a thresholded image; processing the thresholded image using a dilation and erosion process to output a dilated and eroded image; and processing the dilated and eroded image using a contour and moment process to output a finalized processed image; processing the finalized processed image to identify a spatial location of each pupil in the region, each pupil being identified by a two-dimensional spatial coordinate; processing information associated with each pupil identified by the two-dimensional spatial coordinate to output a plurality of two-dimensional spatial coordinates, each of which is in reference to a time, in a two-dimensional space; and outputting a gaze information about the human user.